Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RDRP-781: Unit tests for remaining staging helpers #249

Draft
wants to merge 23 commits into
base: develop
Choose a base branch
from

Conversation

JenCheshire
Copy link
Collaborator

@JenCheshire JenCheshire commented Jun 24, 2024

Ticket changed: hard coded path were relative so just unit tests for:
load_validate_mapper()
load_val_snapshot_json()

Removed as no longer needed:
load_validate_secondary_snapshot()

Pull Request submission

Insert detailed bullet points about your changes here!

Insert any instructions to help the reviewer, e.g. "install new requirements from requirements.txt"

*Let the reviewer know what data files are needed (and if applicable, where they are to be found)

Closes or fixes

  • Detail the ticket(s) you are closing with this PR
    Closes #

Code

  • Code runs The code runs on my machine and/or CDSW
  • Conflicts resolved There are no conflicts (I have performed a rebase if necessary)
  • Requirements My/our code functions according to the requirements of the ticket
  • Dependencies I have updated the environment yaml so it includes any new libraries I have used
  • Configuration file updated any high level parameters that the user may interact with have been put into the config file (and imported to the script)
  • Clean Code
    • Code is as PEP 8 compliant as I can humanly make it
    • Code passess flake8 linting check
    • Code adheres to DRY
  • Type hints All new functions have type hints

Documentation

Any new code includes all the following forms of documentation:

  • Function Documentation Docstrings within the function(s')/methods have been created
    • Includes Args and returns for all major functions
    • The docstring details data types
  • Updated Documentation: User and/or developer working doc has been updated

Data

  • All data needed to run this script is available in Dev/Test
  • All data is excluded from this pull request
  • Secrets checker pre-commit passes

Testing

  • Unit tests Unit tests have been created and are passing or a new ticket to create tests has been created

Peer Review Section

  • All requirements install from (updated) requirements.txt
  • Documentation has been created and is clear - check the working document
  • Doctrings (Google format) have been created and accurately describe the function's functionality
  • Unit tests pass, or if not present a new ticket to create tests has been created
  • Code runs The code runs on reviewer's machine and/or CDSW

Final approval (post-review)

The author has responded to my review and made changes to my satisfaction.

  • I recommend merging this request.

Review comments

Insert detailed comments here!

These might include, but not exclusively:

  • bugs that need fixing (does it work as expected? and does it work with other code
    that it is likely to interact with?)
  • alternative methods (could it be written more efficiently or with more clarity?)
  • documentation improvements (does the documentation reflect how the code actually works?)
  • additional tests that should be implemented (do the tests effectively assure that it
    works correctly?)
  • code style improvements (could the code be written more clearly?)
  • Do the changes represent a change in functionality so the version number should increase? Start a discussion if so.
  • As a review you can generates the same outputs from running the code

Your suggestions should be tailored to the code that you are reviewing.
Be critical and clear, but not mean. Ask questions and set actions.

Copy link

github-actions bot commented Jun 24, 2024

Percentage Coverage for this PR

Detailed Coverage Report
FileStmtsMissCoverMissing
src
   __init__.py00100% 
src/aggregation
   __init__.py00100% 
   aggregation_main.py00100% 
src/construction
   __init__.py00100% 
   all_data_construction.py48480%1–2, 4–5, 7, 9, 17, 23, 46–47, 50, 55–56, 60–62, 66–67, 70, 77–79, 84, 87–91, 93, 98–99, 104–105, 108–110, 114, 118–121, 126–127, 129, 134–135, 139, 141
   construction_helpers.py1041486%31–42, 237, 304
   construction_main.py27270%2–3, 5, 7, 11–12, 15, 18, 54–57, 64, 68–71, 74, 78–81, 84, 89–91, 93
   construction_read_validate.py34340%2–3, 5, 8, 11, 16, 18, 21, 46–49, 51–52, 55–56, 59, 66–68, 72–73, 79, 82, 101–103, 105–106, 109, 116–117, 122, 129
   construction_validation.py102991%83, 86, 113, 121–122, 127, 138, 168, 218
   postcode_construction.py21210%1–2, 4–6, 9, 26, 29, 32, 37, 40–41, 44, 47–49, 51, 55, 59, 62–63
src/estimation
   __init__.py00100% 
   apply_weights.py18288%44–45
   calculate_weights.py520100% 
   estimation_main.py24240%2–5, 7–8, 10, 13, 32, 38, 41, 44–45, 47–56, 58
src/freezing
   __init__.py00100% 
   freezing_apply_changes.py732171%36–39, 42–43, 46–48, 51–54, 58, 66–69, 73–74, 76
   freezing_compare.py722466%111–112, 160–161, 187, 190–192, 195–197, 201–203, 205–207, 209–210, 242, 267, 270, 273, 276
   freezing_main.py49490%1–4, 6, 8–12, 15, 18, 44, 47, 50–51, 53–57, 59, 68, 71–73, 76–77, 81–83, 88–89, 91–92, 95–98, 101, 105, 108, 119–122, 124–126
   freezing_utils.py110100% 
src/imputation
   MoR.py141596%45–46, 94, 313, 463
   __init__.py00100% 
   apportionment.py361072%124, 126, 141, 143, 145, 157–159, 162, 164
   expansion_imputation.py373018%23–24, 26–27, 30, 33–35, 39, 42, 44, 47–48, 50, 53, 56, 76–78, 82, 85, 88, 91, 97, 102, 105, 107, 113, 116, 120
   imputation_helpers.py1205157%39, 52–55, 58–59, 61–62, 64–65, 67–68, 70–71, 73–74, 76–77, 79, 88, 90–91, 93–94, 263, 265–266, 270–271, 273, 277, 298, 301, 305, 308–309, 311, 396, 398–399, 402, 414–415, 417, 419, 435–436, 439–440, 442
   imputation_main.py75750%2–6, 8–17, 20, 23, 57, 60, 64–66, 69–70, 73–74, 77, 79, 84, 87–88, 91, 94–95, 98, 100, 103, 106, 110–111, 113, 117–118, 121, 124–126, 128, 132, 135–136, 138–142, 145–146, 151–153, 155–159, 162, 165, 168–173, 175
   impute_civ_def.py984752%164, 167, 197, 199–200, 203–204, 207–209, 214–215, 217, 220, 222–225, 227, 229, 234–235, 237, 240, 242–244, 247, 249–250, 252–253, 255–256, 269, 272–276, 278–279, 282–283, 285–286, 288
   manual_imputation.py19190%1–2, 4, 6, 9–10, 28–29, 33, 35, 37, 44, 59, 61–62, 64–65, 67–68
   sf_expansion.py746117%24, 27, 30–31, 34–35, 38, 41, 44–45, 48, 50, 57–58, 61, 66–67, 70, 72, 75–76, 78, 80, 83, 87, 89, 100, 103, 107, 109–110, 112, 114, 117, 120, 129, 132, 135, 144, 147, 149, 155, 161, 173, 176, 181–182, 184, 192–193, 196, 200, 204, 207, 215, 220–221, 224, 226, 230, 232
   short_to_long.py21210%1, 3–4, 7, 20, 23, 25, 32–33, 35, 37–38, 40–41, 43–45, 47, 49, 53, 55
   tmi_imputation.py19112534%105, 107, 113–115, 119, 125, 128, 130, 135, 140, 145, 266, 269, 271, 274, 278, 282, 304–305, 308, 311–312, 315, 318–319, 322, 324–325, 327, 329, 332–333, 335, 337, 340, 343–344, 346, 349–351, 353–356, 358, 374, 376, 383, 385–386, 388–389, 391, 393, 395, 397, 400, 403–404, 407, 409, 411, 429–430, 432, 434–435, 437–439, 441–442, 445, 447–448, 451, 453–454, 472, 474, 477–479, 481, 483–484, 486–487, 490, 492–493, 496, 498, 500–501, 521, 525, 528–529, 532–533, 536, 539, 541, 546, 548, 553, 555, 562–563, 568, 573–574, 576, 579–580, 582, 585, 587, 589, 594, 596–597
src/mapping
   __init__.py00100% 
   cellno_mapping.py17476%49–51, 53
   itl_mapping.py100100% 
   mapping_helpers.py55590%24–27, 69
   mapping_main.py47470%2–5, 7–13, 15, 18, 45, 50, 59, 68, 75, 78, 85, 88–90, 97, 99, 102, 104–106, 109, 112, 117–118, 121–123, 125–127, 130–131, 133–135, 138, 141, 144
   pg_conversion.py40880%52, 55, 58, 122, 125, 128, 162–163
   ultfoc_mapping.py34488%49–51, 95
src/northern_ireland
   __init__.py00100% 
   ni_headcount_fte.py29290%2–4, 6, 8, 11, 25, 27, 29, 31, 33, 35, 37, 40, 55, 57–59, 61, 63–64, 66–67, 69, 72, 81, 83–84, 86
   ni_main.py17170%3–8, 10, 13, 35–36, 44–46, 54, 56–57, 59
   ni_staging.py35350%3–7, 9, 11, 14, 22, 25, 27–29, 31, 34, 40, 44, 46, 49, 80–81, 84, 89, 93, 96, 99–104, 107–108, 110, 112
src/outlier_detection
   __init__.py00100% 
   auto_outliers.py810100% 
   manual_outliers.py160100% 
   outlier_main.py40400%2–5, 7–8, 10, 13, 44, 46–48, 50–51, 54, 57, 60–64, 67–68, 70, 75–76, 79, 84–87, 90–94, 96, 99, 101, 103
src/outputs
   __init__.py00100% 
   export_files.py89890%5–11, 13–14, 18, 24, 42, 44, 51, 56, 63, 68, 71, 98, 104, 107, 114, 116, 119, 125, 127–128, 131–135, 138–139, 142, 153–155, 157, 160, 173, 175–176, 178, 181, 201, 204, 207–208, 215, 219, 222–224, 227, 229, 231, 233–234, 236, 239–243, 245–246, 248, 251–253, 256, 259, 262, 276, 279–280, 288, 291, 293, 295, 305, 307–309, 318, 320, 323–324
   form_output_prep.py181327%25–27, 29–30, 34, 36–37, 39, 41, 43, 47, 49
   frozen_group.py45450%3–4, 6–7, 9–11, 13, 16, 47, 49–52, 55, 73, 125, 136, 163, 166–170, 172, 174, 177, 180–181, 183, 186, 189, 194–195, 198–199, 202, 205–207, 210–213, 215
   gb_sas.py25250%2–5, 7–9, 11, 14, 32, 35, 40, 43, 46, 49, 54, 57, 60–62, 65–68, 70
   intram_by_civil_defence.py23230%2–6, 8, 11, 30, 32–33, 36–37, 39, 42, 47, 50, 53–55, 58–61
   intram_by_itl.py551670%46–50, 54, 146, 149, 152–154, 157–158, 161–162, 172
   intram_by_pg.py310100% 
   intram_by_sic.py35350%2–5, 7, 10, 32–33, 36–37, 40–41, 43, 46–48, 53, 69–73, 75, 78, 81, 84, 87, 90–91, 94–97, 100, 102
   intram_totals.py16160%3–4, 6–7, 9, 12, 31–32, 35, 38, 41–42, 44–47
   long_form.py19190%2–5, 7–9, 11, 14, 31, 34, 37, 40–42, 44–47
   manifest_output.py78780%1–4, 8, 11–12, 15, 33, 48–51, 54–55, 59–60, 65–66, 68, 71–75, 78–84, 86, 104–105, 110, 112–113, 118, 121, 123, 125, 127, 131, 141, 146–147, 153, 156–157, 159–160, 166–169, 174, 181, 183, 188, 190–192, 194–195, 197–198, 200–203, 205, 208, 210, 216–217, 220–221
   map_output_cols.py38781%150–151, 153, 155, 158, 161, 163
   ni_sas.py21210%2–8, 10, 13, 28, 31, 34, 37, 42, 45–47, 50–53
   outputs_helpers.py170100% 
   outputs_main.py83830%3–4, 7, 10–22, 24, 27, 53, 58–60, 66, 69, 72–74, 80, 83–84, 87–89, 96, 99–101, 104, 107–109, 111–112, 118, 121–123, 133, 136–138, 142–143, 153, 156–158, 166, 169–171, 175–176, 185, 188–190, 198, 201–203, 210, 213–215, 223, 226–228, 230–232, 234
   short_form.py361558%78, 85, 87, 105, 108, 111, 114, 117, 120–122, 124–127
   tau.py27270%2–8, 10, 13, 33, 37, 40, 43, 46, 49, 54, 57, 59–60, 63–65, 68–71, 73
   total_fte.py14140%2–5, 8, 11, 24, 26–27, 33, 38–41
src/site_apportionment
   __init__.py00100% 
   output_status_filtered.py281739%24, 26, 30–31, 33–34, 36, 43–44, 72, 75, 77, 79–82, 84
   site_apportionment.py1242778%97, 99–105, 458, 460–461, 504, 508, 510, 513, 516, 519, 522, 534, 537, 540, 543, 546, 549, 552, 555, 557
   site_apportionment_main.py25250%2–5, 7–8, 13, 16, 41, 43, 46–47, 50–51, 53–54, 57, 62–67, 69–70
src/staging
   __init__.py00100% 
   postcode_validation.py56394%131–132, 220
   spp_parser.py140100% 
   spp_snapshot_processing.py340100% 
   staging_helpers.py861582%171, 174, 176, 179–180, 183, 186, 188–189, 192, 196–197, 199, 203, 209
   staging_main.py1071070%3–6, 8, 10–11, 15, 18, 66–68, 71, 73–76, 78–79, 83, 85–86, 89, 92, 95–96, 98–99, 104–106, 110–113, 115–116, 122, 127–129, 131–132, 135, 148–150, 155, 158, 162, 164, 166–168, 171, 175, 177, 179–183, 186, 189, 191–192, 195, 197–202, 205, 209–210, 213–217, 219, 222, 231, 239, 249–250, 252–257, 260–261, 263, 266–271, 274–275, 277, 280, 292
   validation.py1384666%33, 36, 85–86, 97, 119, 132–133, 138, 142, 150–151, 157–158, 197–198, 204, 206, 209, 211, 218–219, 298, 300–301, 304–305, 308–309, 312, 315–316, 319, 321, 323–324, 338, 340–346, 349, 352
src/utils
   __init__.py00100% 
   breakdown_validation.py1304565%29, 186–188, 190–194, 196, 198, 201–202, 208–209, 212, 215, 217, 220, 222, 234–237, 240–243, 259–260, 265, 308–309, 311, 317–318, 360–363, 365–368, 371
   config.py149298%157–158
   defence.py200100% 
   helpers.py631871%19–20, 24–25, 27, 133–136, 138–139, 142–144, 153, 155–156, 214
   local_file_mods.py1184561%48–54, 99, 149–151, 200–204, 215–216, 227, 238, 249–250, 252, 263–264, 275–276, 285, 293, 304, 306–307, 309–310, 314–316, 320, 334–335, 338–339, 341, 358, 363
   path_helpers.py127893%56, 269, 271, 273–274, 276–277, 279
   postcode_reduction_helper.py38380%11, 13–14, 16, 19, 22, 25, 27–28, 30–31, 34–35, 38, 40–41, 43–44, 47, 60, 62, 65, 68, 71–72, 74–75, 77, 80, 95, 97, 100–101, 104–105, 107, 109–110
   s3_mods.py1271270%21–22, 26–27, 31, 41, 45–47, 51, 65, 67–68, 71–74, 76–77, 80, 91, 94, 99, 102, 105, 108, 119–120, 122, 125, 139, 143–144, 146, 149, 159, 166, 169, 171, 174, 176, 179, 190–191, 193, 196, 204–205, 208, 217–218, 222–225, 228, 239–240, 243–244, 247, 252, 255, 266–267, 269–272, 274–275, 278, 284, 287, 300, 303, 306, 308, 311, 317, 320, 323, 326, 329, 337–339, 341, 344, 348–349, 351, 354, 383, 387, 389, 396, 399, 404–406, 413, 416, 431, 433–434, 436–437, 441, 443, 445, 448, 450–452, 454, 457, 467, 470–471, 474, 477, 479–480, 483–485
   singleton_boto.py20200%4–5, 8–10, 12–13, 15–19, 23–25, 27–31
TOTAL3942197549% 

Summary of tests

Tests Skipped Failures Errors Time
248 2 💤 0 ❌ 0 🔥 3.447s ⏱️

@JenCheshire JenCheshire marked this pull request as ready for review June 25, 2024 09:30
@JenCheshire JenCheshire requested review from CBROWN-ONS, zorge69 and AnneONS and removed request for zorge69 June 25, 2024 09:30
@JenCheshire
Copy link
Collaborator Author

Hi @CBROWN-ONS and @AnneONS - I've requested a review from you both. Really keen for any ideas to make these better. The new tests are testing functions that contain paths to the config. If they are run in full, they go to real configs which contain all the variables so they error on the test data. I couldn't find a way to avoid this, however, because those functions are already unit tested, I have concentrated on checking that the appropriate functions are called within these tests. Grateful for your thoughts. Thanks

Copy link
Collaborator

@CBROWN-ONS CBROWN-ONS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Jen.

I've not come across much of the unittest package. so this was a nice thing for me to learn reading through your code :).

While the mocking is neat, and seems to work as intended, I have some suggestions for improving the maintainability and robustness of the tests.

To begin with, there are functions within the load_validate_mapper function that aren't being included in the mock tests, where as some are. This is rather inconsistent in a way.

Next, the function is able to be tested using real data, rather than mocks. For example, you are able to write out schemas/configs/jsons to temp paths, and read them back in, using pytest. This allows for real testing of the function, ensuring it is able to act as intended, and not just call the functions. This will more accurately test that the function is acting in the correct way, making the tests more secure.

Using pytest would also align it with the rest of the tests in this module.

In general, the main processes of this function include reading from files, therefore it seems like a requirement that this is happening during the tests, rather than mocks being implemented.

More tests can also be included, such as checking that the messages are being written to the logger. This is unrelated to the mocking approach, and more of a feature request.

Thanks, I'm happy to answer any questions you might have.

tests/test_staging/test_staging_helpers.py Outdated Show resolved Hide resolved
tests/test_staging/test_staging_helpers.py Outdated Show resolved Hide resolved
tests/test_staging/test_staging_helpers.py Outdated Show resolved Hide resolved
tests/test_staging/test_staging_helpers.py Outdated Show resolved Hide resolved

@patch("src.utils.local_file_mods.load_local_json")
@patch("src.staging.validation.validate_data_with_schema")
@patch("src.staging.spp_snapshot_processing.full_responses")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unused

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above

def test_load_validate_secondary_snapshot(
self,
mock_combine_schemas_validate_full_df,
mock_full_responses,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unused

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above

def test_load_val_snapshot_json_correct_response_rate(
self,
mock_combine_schemas_validate_full_df,
mock_full_responses,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unused

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, I think the mock for the function full_responses is being used


@patch("src.utils.local_file_mods.load_local_json")
@patch("src.staging.validation.validate_data_with_schema")
@patch("src.staging.spp_snapshot_processing.full_responses")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unused

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is being used

@AnneONS
Copy link
Collaborator

AnneONS commented Jun 26, 2024

HI @CBROWN-ONS, thanks for your review and writing out your thoughts clearly.
The main reason we're keen on mocking in RDSA is because it's needed when you're developing on hdfs, as we were at the beginning of this project.
I recently updated the devtest version of the code so it would be ready for transfer to the Cloud, and George is keen our project can be used as a test case for the important DAP to Cloud transfer.
Unfortunately the unit tests you created in the output modules don't work in devtest, there is a permissions error.

I could show you this if you are interested.

@JenCheshire JenCheshire marked this pull request as draft June 26, 2024 14:07
@JenCheshire JenCheshire marked this pull request as ready for review August 9, 2024 14:41
@JenCheshire JenCheshire marked this pull request as draft August 9, 2024 15:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants